AITopics | gradient update

gradient update

f86c5c4d4dca70d30b1c12a33a2bc1a4-Supplemental-Conference.pdf

Neural Information Processing Systems, Apr-30-2026, 08:40:39 GMT

In this supplementary material, we provide implementation details in Appendix B, further analysis of ERDA in Appendix C, full experimental results in Appendix D, and parameter studies in Appendix E.

artificial intelligence, full result, machine learning, (16 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.99)

Mitigating Forgetting in Online Continual Learning with Neuron Calibration

Neural Information Processing Systems, Apr-25-2026, 23:23:19 GMT

This appendix is organized as follows: Section A: the detailed dataset statistics and a summary of model properties w.r.t. We present the details on each dataset in Table 4. Under the online continual setting, tasks are observed in a fixed order, and the data from each task arrives as a one-pass stream of samples. The batch size is 10 for all datasets. We do not randomize or optimize the task order.
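The streaming protocol described in this excerpt (fixed task order, a single pass over each task, batches of 10) can be sketched roughly as follows. This is only an illustration under assumptions: the function name `online_stream` and the representation of each task as an in-memory sequence of samples are hypothetical and do not come from the paper.

```python
from typing import Iterator, List, Sequence, Tuple


def online_stream(
    tasks: Sequence[Sequence[object]], batch_size: int = 10
) -> Iterator[Tuple[int, List[object]]]:
    """Yield (task_id, batch) pairs in a fixed task order, one pass per task.

    Hypothetical helper: `tasks` is assumed to be a list of per-task sample
    sequences. Tasks are neither shuffled nor revisited.
    """
    for task_id, samples in enumerate(tasks):
        # Walk through the task's samples exactly once, in order.
        for start in range(0, len(samples), batch_size):
            yield task_id, list(samples[start:start + batch_size])
```

A learner consuming this generator sees each sample exactly once, and the last batch of a task may hold fewer than `batch_size` samples.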

artificial intelligence, benchmark, machine learning, (12 more...)

Country: North America > United States (0.15)

Genre: Instructional Material > Online (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.61)

2376f25ef1725a9e3516ee3c86a59f46-Supplemental-Conference.pdf

Neural Information Processing Systems, Apr-25-2026, 22:00:45 GMT

artificial intelligence, machine learning, subspace, (17 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Composing graphical models with neural networks for structured representations and fast inference

Matthew Johnson, David K. Duvenaud, Alex Wiltschko, Ryan P. Adams, Sandeep R. Datta

Neural Information Processing Systems, Mar-23-2026, 13:40:23 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, deep learning, machine learning, (17 more...)

Country: Europe (0.28)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

QSGD: Communication-Efficient SGD via Gradient Quantization and Encoding

Neural Information Processing Systems, Mar-17-2026, 15:08:11 GMT

Parallel implementations of stochastic gradient descent (SGD) have received significant research attention, thanks to their excellent scalability properties. A fundamental barrier when parallelizing SGD is the high bandwidth cost of communicating gradient updates between nodes; consequently, several lossy compression heuristics have been proposed, by which nodes only communicate quantized gradients. Although effective in practice, these heuristics do not always guarantee convergence, and it is not clear whether they can be improved. In this paper, we propose Quantized SGD (QSGD), a family of compression schemes for gradient updates which provides convergence guarantees. QSGD allows the user to smoothly trade off communication bandwidth and convergence time: nodes can adjust the number of bits sent per iteration, at the cost of possibly higher variance. We show that this trade-off is inherent, in the sense that improving it past some threshold would violate information-theoretic lower bounds. QSGD guarantees convergence for convex and non-convex objectives, under asynchrony, and can be extended to stochastic variance-reduced techniques. When applied to training deep neural networks for image classification and automated speech recognition, QSGD leads to significant reductions in end-to-end training time. For example, on 16 GPUs, we can train the ResNet152 network to full accuracy on ImageNet 1.8x faster than the full-precision variant.
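To make the bandwidth/variance trade-off concrete, here is a minimal sketch of one common form of stochastic gradient quantization: each coordinate's magnitude, normalized by the vector's Euclidean norm, is randomly rounded to one of `s` uniform levels so that the result is unbiased. The abstract does not spell out the scheme, so this should be read as an illustration rather than the authors' implementation. The names `quantize`, `s`, and `rng` are assumptions, and the lossless encoding of the quantized values is omitted.

```python
import numpy as np


def quantize(v, s, rng):
    """Stochastically round |v_i| / ||v||_2 onto the grid {0, 1/s, ..., 1}.

    Illustrative sketch only: `s` (number of levels) and `rng` (a NumPy
    Generator) are assumed parameters, not names taken from the paper.
    """
    v = np.asarray(v, dtype=float)
    norm = np.linalg.norm(v)
    if norm == 0.0:
        return np.zeros_like(v)
    scaled = np.abs(v) / norm * s      # each entry lies in [0, s]
    lower = np.floor(scaled)           # nearest grid level below
    prob_up = scaled - lower           # chance of rounding up keeps E[q] = v
    levels = lower + (rng.random(v.shape) < prob_up)
    return norm * np.sign(v) * levels / s
```

A smaller `s` means fewer distinct levels to transmit per coordinate but a noisier estimate, which is the kind of trade-off the abstract describes; a larger `s` approaches the full-precision gradient.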

artificial intelligence, machine learning, proceedings, (7 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

KDGAN: Knowledge Distillation with Generative Adversarial Networks

Neural Information Processing Systems, Mar-16-2026, 17:26:02 GMT

Knowledge distillation (KD) aims to train a lightweight classifier that can provide accurate inference under constrained resources in multi-label learning. Instead of directly consuming feature-label pairs, the classifier is trained by a teacher, i.e., a high-capacity model whose training may be resource-hungry. The accuracy of a classifier trained this way is usually suboptimal because it is difficult to learn the true data distribution from the teacher. An alternative method is to adversarially train the classifier against a discriminator in a two-player game akin to generative adversarial networks (GAN), which can ensure that the classifier learns the true data distribution at the equilibrium of this game. However, it may take an excessively long time for such a two-player game to reach equilibrium due to high-variance gradient updates.

artificial intelligence, classifier, machine learning, (11 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
